F-012: perf(services): cache compiled built-in secret regex patterns #16
Open
Sephyi wants to merge 1 commit into development from
`build_patterns()` is called every time the configured custom/disabled secret-pattern lists change, and it previously re-ran `builtin_patterns()`, which compiled all 24 built-in regexes from scratch on each call.

Cache the built-in `Vec<SecretPattern>` in a `LazyLock` so the 24 regexes are compiled exactly once per process. `build_patterns()` now clones cheap entries (`Regex` uses internal `Arc` sharing; the string fields are `Cow<'static, str>`) from the cached slice rather than recompiling. The filter semantics for `disabled_secret_patterns` are preserved.

This also consolidates the separate `DEFAULT_PATTERNS` `LazyLock` into the single `BUILTIN_PATTERNS` cache; `scan_for_secrets` and `scan_full_diff_for_secrets` now reference it via `builtin_patterns()`, which returns a `&'static [SecretPattern]`. The public API of `build_patterns()` is unchanged.

Closes audit entry F-012 from #3.
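The caching scheme described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `CompiledPattern` is a hypothetical stand-in for a compiled regex (cloning it only bumps an `Arc` reference count), `COMPILE_COUNT` exists purely to show how often compilation runs, and the two pattern names and sources are made up. Only `SecretPattern`, `BUILTIN_PATTERNS`, `builtin_patterns()`, and `build_patterns()` are names taken from the description, and the signature of `build_patterns()` here is an assumption.

```rust
use std::borrow::Cow;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, LazyLock};

// Illustration only: counts how many times the expensive "compile" step runs.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a compiled regex; a clone shares the same Arc.
#[derive(Clone, Debug)]
struct CompiledPattern(Arc<str>);

impl CompiledPattern {
    fn compile(source: &str) -> Self {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        CompiledPattern(Arc::from(source))
    }
}

#[derive(Clone, Debug)]
struct SecretPattern {
    name: Cow<'static, str>,
    regex: CompiledPattern,
}

// Built once, on first access, and then shared for the life of the process.
static BUILTIN_PATTERNS: LazyLock<Vec<SecretPattern>> = LazyLock::new(|| {
    vec![
        // Hypothetical entries; the real list has 24 patterns.
        SecretPattern {
            name: Cow::Borrowed("AWS Access Key"),
            regex: CompiledPattern::compile(r"AKIA[0-9A-Z]{16}"),
        },
        SecretPattern {
            name: Cow::Borrowed("Private Key"),
            regex: CompiledPattern::compile(r"-----BEGIN PRIVATE KEY-----"),
        },
    ]
});

// Borrow the cached patterns without cloning or recompiling.
fn builtin_patterns() -> &'static [SecretPattern] {
    BUILTIN_PATTERNS.as_slice()
}

// Assumed signature: returns an owned list with disabled names filtered out
// (case-insensitive), cloning entries from the cache.
fn build_patterns(disabled: &[String]) -> Vec<SecretPattern> {
    let builtin = builtin_patterns();
    if disabled.is_empty() {
        builtin.to_vec()
    } else {
        let disabled_lower: Vec<String> =
            disabled.iter().map(|s| s.to_lowercase()).collect();
        builtin
            .iter()
            .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
            .cloned()
            .collect()
    }
}
```

Calling `build_patterns()` repeatedly in this sketch leaves `COMPILE_COUNT` at the number of built-in entries, because every call after the first only clones from the cached `Vec`.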
Pull request overview
This PR addresses audit finding F-012 by avoiding repeated compilation of the built-in secret-scanning regex patterns, improving performance for repeated secret scans.
Changes:
- Cache the 24 built-in `SecretPattern` regexes in a process-wide `LazyLock` and expose them via a slice.
- Make `SecretPattern` `Clone` so callers can cheaply clone cached patterns when building a mutable pattern list.
- Update default secret-scan entrypoints to use the cached built-in patterns directly.
Comment on lines +43 to +51

         let mut patterns: Vec<SecretPattern> = if disabled.is_empty() {
             builtin.to_vec()
         } else {
             let disabled_lower: Vec<String> = disabled.iter().map(|s| s.to_lowercase()).collect();
    -        patterns.retain(|p| !disabled_lower.contains(&p.name.to_lowercase()));
    -    }
             builtin
                 .iter()
                 .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
                 .cloned()
                 .collect()
The disabled-pattern filtering path does repeated `Vec::contains` lookups and recomputes `p.name.to_lowercase()` for every built-in pattern. Consider using a `HashSet<String>` for the lowercased disabled names to avoid repeated linear scans/allocations (even if the built-in set is small today).
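A minimal sketch of the suggested `HashSet<String>` variant follows. It assumes a simplified, hypothetical `SecretPattern` that carries only a name, and `filter_disabled` is an illustrative helper, not a function from this PR. It replaces the linear `Vec::contains` scan with a hashed lookup; each pattern name is still lowercased once per call, so avoiding that allocation as well would need a precomputed lowercase name stored alongside each pattern.

```rust
use std::collections::HashSet;

// Hypothetical, simplified pattern type: only the name matters for filtering.
#[derive(Clone, Debug, PartialEq)]
struct SecretPattern {
    name: &'static str,
}

// Illustrative helper: clone every built-in pattern whose name is not in
// `disabled`, comparing names case-insensitively.
fn filter_disabled(builtin: &[SecretPattern], disabled: &[String]) -> Vec<SecretPattern> {
    if disabled.is_empty() {
        return builtin.to_vec();
    }
    // Lowercase the disabled names once and store them for O(1) average lookups.
    let disabled_lower: HashSet<String> =
        disabled.iter().map(|s| s.to_lowercase()).collect();
    builtin
        .iter()
        .filter(|p| !disabled_lower.contains(&p.name.to_lowercase()))
        .cloned()
        .collect()
}
```

With only 24 built-in patterns the difference is likely small in practice; the set mainly keeps the cost flat if the disabled list grows.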
Summary
perf(services): cache compiled built-in secret regex patterns.
Audit context
Closes audit entry F-012 from #3.
Verification
- `cargo fmt --check`
- `cargo clippy --all-targets --all-features -- -D warnings`
- `cargo test --all-targets`